Near-Optimal BRL using Optimistic Local Transitions

Author: Araya Mauricio
Buffet Olivier
Thomas Vincent
Publication date: 18/06/2012

Model-based Bayesian Reinforcement Learning (BRL) allows a sound formalization of the problem of acting optimally while facing an unknown environment, i.e., avoiding the exploration-exploitation dilemma. However, algorithms explicitly addressing BRL suffer from such a combinatorial explosion that a large body of work relies on heuristic algorithms. This paper introduces BOLT, a simple and (almost) deterministic heuristic algorithm for BRL which is optimistic about the transition function. We analyze BOLT's sample complexity, and show that under certain parameters, the algorithm is near-optimal in the Bayesian sense with high probability. Then, experimental results highlight the key differences of this method compared to previous work. Comment: ICML201

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL-Rennes 1
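
As an illustration of what "optimistic about the transition function" in the abstract above could mean, here is a minimal sketch. It is not taken from the paper: the Dirichlet-count representation, the function name `optimistic_backup`, and the parameters `eta` (number of artificial observations) and `gamma` (discount factor) are all assumptions made for this example.

```python
def optimistic_backup(counts, rewards, values, eta, gamma):
    """One optimistic Bellman backup for a single state-action pair.

    counts[s]  : Dirichlet pseudo-counts of observed transitions to successor s
                 (assumed belief representation, not from the abstract)
    rewards[s] : reward received when landing in successor s
    values[s]  : current value estimate of successor s
    eta        : hypothetical number of artificial observations to add
    gamma      : discount factor
    """
    best = float("-inf")
    # Try adding the eta artificial observations to each candidate successor
    # in turn and keep the most favourable resulting estimate.
    for sigma in range(len(counts)):
        boosted = [c + (eta if s == sigma else 0.0) for s, c in enumerate(counts)]
        total = sum(boosted)
        q = sum((c / total) * (r + gamma * v)
                for c, r, v in zip(boosted, rewards, values))
        best = max(best, q)
    return best
```

With `eta = 0` this reduces to the ordinary expected backup under the mean transition model; larger `eta` values push the estimate toward the best-looking successor, which is the sense of optimism sketched here.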

Les POMDP font de meilleurs hackers: Tenir compte de l'incertitude dans les tests de penetration

Author: Buffet Olivier
Hoffmann Joerg
Sarraute Carlos
Publication date: 22/05/2012

Penetration Testing is a methodology for assessing network security by generating and executing possible hacking attacks. Doing so automatically allows for regular and systematic testing. A key question is how to generate the attacks. This is naturally formulated as planning under uncertainty, i.e., under incomplete knowledge about the network configuration. Previous work uses classical planning and requires costly pre-processing that reduces this uncertainty through extensive application of scanning methods. By contrast, we herein model the attack planning problem in terms of partially observable Markov decision processes (POMDPs). This makes it possible to reason about the knowledge available, and to intelligently employ scanning actions as part of the attack. As one would expect, this accurate solution does not scale. We devise a method that relies on POMDPs to find good attacks on individual machines, which are then composed into an attack on the network as a whole. This decomposition exploits network structure to the extent possible, making targeted approximations (only) where needed. Evaluating this method on a suitably adapted industrial test suite, we demonstrate its effectiveness in both runtime and solution quality. Comment: JFPDA 2012 (7èmes Journées Francophones Planification, Décision, et Apprentissage pour la conduite de systèmes), Nancy, Franc

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Factored planning using decomposition trees

Author: Buffet Olivier
Huang Jinbo
Kelareva Elena
Thiebaux Sylvie
Publication venue: Carnegie Mellon University
Publication date: 24/02/2016

Improving AI planning algorithms relies on the ability to exploit the structure of the problem at hand. A promising direction is that of factored planning, where the domain is partitioned into subdomains with as little interaction as possible. Recent wor

The Australian National University

Decentralized Traffic Management: A Synchronization-Based Intersection Control

Author: Buffet Olivier
Simonin Olivier
Tlig Mohamed
Publication venue: HAL CCSD
Publication date: 01/05/2014

Controlling vehicle traffic in large networks remains an important challenge in urban environments and transportation systems. Autonomous vehicles are today considered a promising approach to traffic control. In this paper, we propose a synchronization-based intersection control mechanism that allows autonomous vehicle-agents to cross without stopping, i.e., in order to avoid congestion (delays) and energy loss. We decentralize the problem by managing the traffic of each intersection independently of the others. We define control agents that are able to synchronize the multiple flows of vehicles at each intersection by alternating vehicles from both directions. We present experimental results in simulation, which allow us to evaluate the approach and to compare it with a traffic light strategy. These results show a significant gain in terms of time and energy at an intersection and in a network.

INRIA a CCSD electronic archive server
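
The alternation between two flows described in the abstract above can be pictured with a small scheduling sketch. This is an assumption-laden illustration, not the authors' mechanism: the function name `alternate_slots`, the fixed `slot` duration, and the representation of each flow as a list of arrival times are all hypothetical.

```python
def alternate_slots(arrivals_a, arrivals_b, slot):
    """Assign crossing times to vehicles of two flows, alternating A, B, A, ...

    arrivals_a, arrivals_b : sorted arrival times of each flow (assumed input)
    slot                   : hypothetical time one vehicle occupies the intersection
    Returns a list of (flow, crossing_time) pairs in crossing order.
    """
    queues = {"A": list(arrivals_a), "B": list(arrivals_b)}
    schedule = []
    turn, t = "A", 0.0
    while queues["A"] or queues["B"]:
        # If the flow whose turn it is has no vehicle left, give the slot
        # to the other flow instead of leaving it empty.
        if not queues[turn]:
            turn = "B" if turn == "A" else "A"
        arrival = queues[turn].pop(0)
        # A vehicle cannot cross before it arrives or before the slot is free.
        t = max(t, arrival)
        schedule.append((turn, t))
        t += slot
        turn = "B" if turn == "A" else "A"
    return schedule
```

Strict alternation as written can make a vehicle wait for a late arrival on the other flow; a controller of the kind the abstract describes would presumably adjust vehicle speeds ahead of the intersection so that assigned slots are met without stopping, which this sketch does not model.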

Daquar : du diagnostic acoustique d'un quartier à l'urbanité sonore

Author: Balaÿ Olivier
Buffet Valérie
Publication venue: HAL CCSD
Publication date: 01/01/2001

The DAQUAR experiment in Lyon, currently in its early stages, can be described as exemplary. It is an opportunity to show an administrative body the practical capabilities of placing the city under sound observation, not only to combat noise, but also to understand and shape the phonic diversity that residents notice

Decentralized Traffic Management: A Synchronization-Based Intersection Control --- Extended Version

Author: Buffet Olivier
Simonin Olivier
Tlig Mohamed
Publication venue: HAL CCSD
Publication date: 01/01/2014

Controlling vehicle traffic in large networks remains an important challenge in urban environments and transportation systems. Autonomous vehicles are today considered a promising approach to traffic control. In this paper, we propose a synchronization-based intersection control mechanism that allows autonomous vehicle-agents to cross without stopping, i.e., in order to avoid congestion (delays) and energy loss. We decentralize the problem by managing the traffic of each intersection independently of the others. We define control agents that are able to synchronize the multiple flows of vehicles at each intersection by alternating vehicles from both directions. We present experimental results in simulation, which allow us to evaluate the approach and to compare it with a traffic light strategy. These results show a significant gain in terms of time and energy at an intersection and in a network.

CiteSeerX

INRIA a CCSD electronic archive server

Croisement synchronisé de flux de véhicules autonomes dans un réseau

Author: Buffet Olivier
Simonin Olivier
Tlig Mohamed
Publication venue: HAL CCSD
Publication date: 01/07/2013

Autonomous vehicles are today considered a promising approach for transporting resources and regulating traffic. In this article, we examine the possibility of letting flows of autonomous vehicles cross without stopping them, in order to avoid congestion (delays) and energy loss. We propose an intersection control based on the temporal synchronization of vehicles. We decentralize the problem by managing each intersection independently. We define a control agent that is able to synchronize the vehicles arriving at an intersection while ensuring alternation between the flows. We present a simulator that makes it possible to evaluate the approach and to compare it with the standard traffic-light strategy. The experimental results show a significant gain in terms of time and energy for vehicles at an intersection and in a network

INRIA a CCSD electronic archive server

Apprendre à agir dans un Dec-POMDP

Author: Buffet Olivier
Dibangoye Jilles
Publication venue: HAL CCSD
Publication date: 07/06/2018

We address a long-standing open problem of reinforcement learning in decentralized partially observable Markov decision processes. Previous attempts focussed on different forms of generalized policy iteration, which at best led to local optima. In this paper, we restrict attention to plans, which are simpler to store and update than policies. We derive, under certain conditions, the first near-optimal cooperative multi-agent reinforcement learning algorithm. To achieve significant scalability gains, we replace the greedy maximization by mixed-integer linear programming. Experiments show our approach can learn to act near-optimally in many finite domains from the literature

INRIA a CCSD electronic archive server